Inclass - Lab
(Day 4)

Let us import the required libraries.

In [2]:
# import the libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

Let's begin with some hands-on practice exercises

1. Plots using Library Matplotlib

a. Scatter Plots

1. Plot a scatter plot for the following data. Also add a title and axis labels
        bmi = (18, 34, 54, 45, 45, 45, 23, 23, 54)
        medical_claim = (145, 456, 764, 234, 156, 786, 345, 455, 675)
In [6]:
# type your code here
bmi = (18, 34, 54, 45, 45, 45, 23, 23, 54)
medical_claim = (145, 456, 764, 234, 156, 786, 345, 455, 675)
plt.scatter(bmi,medical_claim)
plt.title('Scatter Plot')
plt.xlabel('BMI')
plt.ylabel('Medical_Claim')
plt.show()
2. Write a code to get the following output

In [49]:
# type your code here
#plt.figure(figsize=(15,8))
x=np.linspace(0,20,100)
y=np.cos(x)
# the colour is given once, through the keyword, so it is not defined twice
plt.plot(x,y,'-o',color='black')
plt.title('Cosine Curve')
plt.xlabel('X')
plt.ylabel('Cosine(X)')
plt.show()
3. Create a bubble plot. Further, use the parameter 'alpha' to adjust the transparency level. (Generate the data using random numbers)

Note: A bubble chart is a type of scatter plot that represents three dimensional data. The third variable is represented by the size of a point (marker).

In [3]:
# type your code here
f = plt.figure(figsize=(15,8), dpi=100)
np.random.seed(23)
x_alpha = np.random.randint(low = 0,high = 51,size = 50)
y_alpha = np.random.randint(low = 0,high = 51,size = 50)
p3 = plt.scatter(x=x_alpha,y=y_alpha,s=(y_alpha**1.8),alpha = 0.6)
plt.show()
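The note above says that the marker size carries a third variable. A minimal sketch of that idea follows, assuming a hypothetical third variable `z` that is generated independently of `x` and `y` (the names `rng`, `z` and the scaling factor 10 are illustrative choices, not part of the lab data).

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: x and y give the position, z is an independent third variable
rng = np.random.default_rng(23)
x = rng.integers(0, 51, size=50)
y = rng.integers(0, 51, size=50)
z = rng.integers(1, 101, size=50)

fig, ax = plt.subplots(figsize=(15, 8))
# Marker area (s, in points^2) encodes z; alpha keeps overlapping bubbles visible
bubbles = ax.scatter(x, y, s=z * 10, alpha=0.6)
ax.set_title('Bubble Plot')
ax.set_xlabel('x')
ax.set_ylabel('y')
plt.show()
```

Scaling `z` before passing it to `s` is only a matter of taste: it makes the small bubbles large enough to see.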
4. Plot the cosine curve and the exponential curve for values from 0.1 to 20, in the same plot (get the following output)

In [3]:
# type your code here
x = np.linspace(0.1, 20, 100)
cosx = np.cos(x)
expx = np.exp(x)
f, ax1 = plt.subplots()

ax1.grid(False)
ax1.plot(x, expx, color='blue', label='Expo curve', linewidth=2)
plt.xlabel('x')
plt.ylabel('Exp(x)')

ax2 = ax1.twinx()
ax2.grid(False)
ax2.plot(x, cosx, color='red', label='Cosine curve', linewidth=2)
plt.ylabel('Cosine(x)')

plt.tight_layout()
plt.show()
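The curves above are given labels, but no legend is drawn. With twin axes each axes object keeps its own handles, so one possible approach (a sketch, not the only way) is to collect the handles from both axes and pass them to a single `legend` call. The legend position `'upper left'` is an arbitrary choice.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0.1, 20, 100)

fig, ax1 = plt.subplots()
ax1.plot(x, np.exp(x), color='blue', label='Expo curve', linewidth=2)
ax1.set_xlabel('x')
ax1.set_ylabel('Exp(x)')

# Second y-axis sharing the same x-axis
ax2 = ax1.twinx()
ax2.plot(x, np.cos(x), color='red', label='Cosine curve', linewidth=2)
ax2.set_ylabel('Cosine(x)')

# Collect the handles and labels from both axes and draw a single legend
handles1, labels1 = ax1.get_legend_handles_labels()
handles2, labels2 = ax2.get_legend_handles_labels()
legend = ax1.legend(handles1 + handles2, labels1 + labels2, loc='upper left')

fig.tight_layout()
plt.show()
```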

b. Barplots

5. The exports and imports (in billion dollars) are given for a country from 2001 to 2005. Draw a barplot for the data
Year    Import    Export
2001    54.4      42.5
2002    53.8      44.5
2003    61.6      48.3
2004    74.15     57.24
2005    89.33     69.18
In [2]:
# type your code here
Year=[2001,2002,2003,2004,2005]
Import=[54.4,53.8,61.6,74.15,89.33]
Export=[42.5,44.5,48.3,57.24,69.18]
x=np.arange(len(Year))
width=0.4
fig,ax=plt.subplots()
bars1=ax.bar(x-width/2,Import,width,label='Imports')
bars2=ax.bar(x+width/2,Export,width,label='Exports')
# place one tick under each pair of bars and label it with the year
plt.xticks(x,Year)
plt.xlabel('Year')
plt.ylabel('Import and Export (in Billion Dollars)')
plt.legend()
plt.show()
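The same grouped barplot can also be sketched with the pandas plotting interface, which works out the bar offsets on its own. The DataFrame name `trade` is a hypothetical choice; the figures are the ones given in the question.

```python
import pandas as pd
import matplotlib.pyplot as plt

trade = pd.DataFrame({'Year': [2001, 2002, 2003, 2004, 2005],
                      'Import': [54.4, 53.8, 61.6, 74.15, 89.33],
                      'Export': [42.5, 44.5, 48.3, 57.24, 69.18]})

# With Year as the index, each remaining column becomes one group of bars
ax = trade.set_index('Year').plot(kind='bar', rot=0)
ax.set_xlabel('Year')
ax.set_ylabel('Import and Export (in Billion Dollars)')
plt.show()
```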

c. Pie plots

6. Plot a pie chart for the following data
    prices = [50, 25, 50, 20]
    labels = ['Apple', 'Orange','Banana', 'Mango']
In [43]:
# type your code here
prices = [50, 25, 50, 20]
labels = ['Apple', 'Orange','Banana', 'Mango']
plt.pie(prices,labels=labels,autopct='%1.2f%%',radius=0.75,explode=[0,0,0,0])
plt.show()
7. The following information was released after the Indian budget was declared. Plot a donut chart for it
Income Form Amount (in paise)
Corporation Tax 21
Income Tax 16
Customs 4
Union Excise Duties 8
GST & Other Taxes 19
Non-Tax Revenue 9
Non-Debt Capital Receipts 3
Borrowings and Other Liabilities 20
In [4]:
# type your code here
f = plt.figure(figsize=(10,8), dpi=100)
income_form = ["Corporation Tax","Income Tax","Customs","Union Excise Duties","GST & Other Taxes","Non-Tax Revenue","Non-Debt Capital Receipts","Borrowings and Other Liabilities"]
amount = [21,16,4,8,19,9,3,20]
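# wedgeprops with a width below 1 leaves a hole in the centre, turning the pie into a donut
p7 = plt.pie(x = amount,labels = income_form,autopct="%1.0f%%",pctdistance = 0.8,wedgeprops = {"width": 0.4})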
plt.show()

d. Line plots

8. Create a tuple of numbers 12, 34, 54, 45, 45, 45, 23, 23, 54 and plot a simple line plot
In [5]:
# type your code here
t1 = (12,34,54,45,45,45,23,23,54)
plt.plot(t1,"-ok")
plt.title("tuple of numbers")
plt.show()
9. Import 'Returns.csv' and plot a multiple line plot for the returns of each company
In [7]:
# type your code here
d9 = pd.read_csv("Returns.csv")
# draw the returns of every company on the same axes, one line per company
companies = ["TCS","HDFCBANK","JINDALSTEL","TATAMOTORS","INFY","BOSCHLTD"]
d9[companies].plot(kind = "line",figsize = (15,8))
plt.title("Returns of each company")
plt.ylabel("Returns")
plt.show()

2. Plots using Library Seaborn

10. Plot a strip plot using inbuilt data-set 'iris' given in seaborn
In [56]:
# type your code here
sns.set(style='whitegrid')
iris=sns.load_dataset('iris')
sns.stripplot(x='species',y='sepal_length',data=iris)
plt.title('strip plot')
plt.show()
11. Import the dataset age_height.csv. Plot a heat map for the correlation matrix
In [59]:
# type your code here
df=pd.read_csv('age_height.csv')
corr=df.corr()
sns.heatmap(corr,annot=True)
plt.title('Correlation Matrix')
plt.show()
12. Plot a swarmplot using inbuilt data-set 'iris' given in seaborn
In [65]:
# type your code here
sns.swarmplot(x='species',y='sepal_length',data=iris)
plt.title('Swarmplot')
plt.show()
13a. Using the following data, plot a scatter plot with the function available in seaborn
        x = (12, 34, 54, 45, 45, 45, 23, 23, 54)
        y = (145, 456, 764, 234, 156, 786, 345, 455, 675)
In [70]:
# type your code here
x = (12, 34, 54, 45, 45, 45, 23, 23, 54)
y = (145, 456, 764, 234, 156, 786, 345, 455, 675)
sns.scatterplot(x=x,y=y)
plt.title('Scatter Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.show()
13b. Using the data generated in question 3, plot a scatter plot with the function available in seaborn
In [8]:
# type your code here
np.random.seed(23)
x_alpha = np.random.randint(low = 0,high = 51,size = 50)
y_alpha = np.random.randint(low = 0,high = 51,size = 50)
# marker size encodes y_alpha; sizes sets the smallest and largest marker area
p13 = sns.scatterplot(x=x_alpha,y=y_alpha,size=y_alpha,sizes=(20,400),alpha = 0.6,color = "blue")
plt.legend()
plt.show()
14. Using the 'diamonds' data available in the library seaborn, consider the price for the type of cut. Display the data using the following plots:
  1. Multiple boxplot
  2. Violin Plot
  3. Boxen Plot
In [4]:
# type your code here
diamonds=sns.load_dataset('diamonds')
sns.catplot(x='cut',y='price',kind='box',data=diamonds)
plt.title('Multiple Boxplot')
plt.show()
In [80]:
sns.catplot(x='cut',y='price',kind='violin',data=diamonds)
plt.title('Violin Plot')
plt.show()
In [84]:
sns.catplot(x='cut',y='price',kind='boxen',data=diamonds)
plt.title('Boxen Plot')
plt.show()
15. Generate random numbers from a normal distribution and plot the following:
  1. Histogram
  2. Histogram with frequency curve
In [9]:
# type your code here
np.random.seed(23)
p15 = np.random.randn(100)
# histplot replaces the deprecated distplot; rugplot adds the tick marks along the x-axis
sns.histplot(p15,kde = False,color = "blue")
sns.rugplot(p15,color = "blue")
plt.title("Histogram")
plt.xlabel("x")
plt.ylabel("Frequency")
plt.show()
In [10]:
np.random.seed(23)
p15 = np.random.randn(100)
# kde = True overlays the frequency (density) curve on the histogram
sns.histplot(p15,kde = True,color = "red")
sns.rugplot(p15,color = "red")
plt.title("Histogram with frequency curve")
plt.show()
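The frequency curve can also be drawn by hand with matplotlib alone. The sketch below assumes the sample comes from `np.random.randn`, so the curve it overlays is the standard normal density exp(-x^2/2)/sqrt(2*pi); the bin count of 15 and the names `sample`, `grid` and `pdf` are illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(23)
sample = np.random.randn(100)

fig, ax = plt.subplots()
# density=True rescales the bars so that their total area is 1
counts, bin_edges, _ = ax.hist(sample, bins=15, density=True, alpha=0.5, color='red')

# Standard normal density evaluated on a grid spanning the sample
grid = np.linspace(sample.min(), sample.max(), 200)
pdf = np.exp(-grid**2 / 2) / np.sqrt(2 * np.pi)
ax.plot(grid, pdf, color='black', label='N(0, 1) density')

ax.set_title('Histogram with frequency curve')
ax.set_xlabel('x')
ax.set_ylabel('Density')
ax.legend()
plt.show()
```

Because the bars are scaled to unit area, they share the same vertical scale as the density curve and the two can be compared directly.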

3. Plots using Library Plotly

16. Use the 'diamonds' data set from seaborn to plot the histogram of price by colors
In [27]:
# type your code here
import plotly.express as px
# one histogram of price per diamond colour, drawn on the same axes
px.histogram(data_frame=diamonds,x="price",color="color",title="Histogram of price by color")
17. Plot a histogram for 500 random numbers which have mean 0 and variance 1
In [19]:
# type your code here
18. Using the 'tips' data from seaborn, plot a violin plot for 'tips' based on the 'sex'
In [13]:
# type your code here
import plotly.express as px
p18 = sns.load_dataset('tips')
# tip on the y-axis, one violin per category of sex
px.violin(data_frame=p18,x = "sex",y = "tip",title= "violin plot")
19. Using the tips dataset from seaborn, plot a donut plot with legend representing the percentage of the tip got on that day
In [16]:
# type your code here
p19 = sns.load_dataset('tips')
# hole leaves the centre empty, turning the pie into a donut; the legend lists the days
px.pie(data_frame = p19,names = "day",values = "tip",hole = 0.4,title = "Share of tips by day")
20. The exports and imports (in billion dollars) are given for a country from 2001 to 2005. Draw a barplot for the data
Year    Import    Export
2001    54.4      42.5
2002    53.8      44.5
2003    61.6      48.3
2004    74.15     57.24
2005    89.33     69.18
In [17]:
# type your code here
p20 = pd.DataFrame({"year" : ["2001","2002","2003","2004","2005"],
                    "imports" : [54.4,53.8,61.6,74.15,89.33],
                    "exports" : [42.5,44.5,48.3,57.24,69.18]})
# barmode "group" places imports and exports side by side instead of on top of each other
px.bar(data_frame = p20,x = "year",y = ["imports","exports"],barmode ="group")